Agents
The Download: OpenAI is building a fully automated researcher, and a psychedelic trial blind spot
Plus: OpenAI is also creating a super app. OpenAI has a new grand challenge: building an AI researcher, a fully automated agent-based system capable of tackling large, complex problems by itself. The San Francisco firm said the new goal will be its "north star" for the next few years. By September, the company plans to build "an autonomous AI research intern" that can take on a small number of specific research problems. The intern will be the precursor to the fully automated multi-agent system, which is slated to debut in 2028. In an exclusive interview this week, OpenAI's chief scientist, Jakub Pachocki, talked me through the plans.
- North America > United States > California > San Francisco County > San Francisco (0.25)
- North America > United States > Massachusetts (0.05)
- Europe (0.05)
- Asia > China (0.05)
Machine learning framework to predict global imperilment status of freshwater fish
Researchers spent five years developing an AI-based model to protect freshwater fish worldwide from extinction, with a particular focus on identifying threats to fish before they become endangered. "People sometimes go in to protect species when it's already too late," said Ivan Arismendi, an associate professor in Oregon State University's Department of Fisheries, Wildlife, and Conservation Sciences. "With our model, decision makers can deploy resources in advance before a species becomes imperiled." The findings were recently published in the journal Nature Communications. Nearly one-third of freshwater fish species face possible extinction, threatening food supplies, ecosystems and outdoor recreation.
- North America > United States > Oregon (0.30)
- North America > United States > Maine (0.06)
- Europe > Spain > Catalonia (0.05)
- (2 more...)
- Information Technology > Communications > Social Media (0.74)
- Information Technology > Artificial Intelligence > Natural Language (0.73)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.32)
Meta AI agent's instruction causes large sensitive data leak to employees
The data leak triggered a major internal security alert inside Meta. Fri 20 Mar 2026 02.00 EDT. Last modified on Fri 20 Mar 2026 03.03 EDT. An AI agent instructed an engineer to take actions that exposed a large amount of Meta's sensitive data to some of its employees, in the latest example of AI causing upheaval in a large tech company. The leak, which Meta confirmed, happened when an employee asked for guidance on an engineering problem on an internal forum. An AI agent responded with a solution, which the employee implemented, causing a large amount of sensitive user and company data to be exposed to its engineers for two hours.
- North America > United States (0.17)
- Europe > Ukraine (0.07)
- Oceania > Australia (0.05)
- Information Technology > Security & Privacy (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.97)
Google Shakes Up Its Browser Agent Team Amid OpenClaw Craze
As Silicon Valley obsesses over a new wave of AI coding agents, Google and other AI labs are shifting their bets. Google is shaking up the team behind Project Mariner, its AI agent that can navigate the Chrome browser and complete tasks on a user's behalf, WIRED has learned. In recent months, some Google Labs staffers who worked on the research prototype have moved on to higher-priority projects, according to two people familiar with the matter. A Google spokesperson confirmed the changes, but said the computer use capabilities developed under Project Mariner will be incorporated into the company's agent strategy moving forward. Google has already folded some of these capabilities into other agent products, including the recently launched Gemini Agent, the spokesperson added.
- North America > United States > California > San Francisco County > San Francisco (0.05)
- Europe > Slovakia (0.05)
- Europe > Czechia (0.05)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.90)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.76)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.74)
Policy Gradient With Value Function Approximation For Collective Multiagent Planning
Decentralized (PO)MDPs provide an expressive framework for sequential decision making in a multiagent system. Given their computational complexity, recent research has focused on tractable yet practical subclasses of Dec-POMDPs. We address such a subclass called CDec-POMDP where the collective behavior of a population of agents affects the joint-reward and environment dynamics. Our main contribution is an actor-critic (AC) reinforcement learning method for optimizing CDec-POMDP policies. Vanilla AC converges slowly on larger problems. To address this, we show how a particular decomposition of the approximate action-value function over agents leads to effective updates, and also derive a new way to train the critic based on local reward signals. Comparisons on a synthetic benchmark and a real-world taxi fleet optimization problem show that our new AC approach provides better quality solutions than previous best approaches.
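The two ideas in the abstract, a critic decomposed into per-agent terms and updates driven by local reward signals, can be sketched in a tiny tabular setting. Everything below (the bandit-like setup, reward function, and learning rate) is an illustrative assumption, not the paper's actual CDec-POMDP formulation:

```python
import numpy as np

# Hedged sketch: a factored critic in the spirit of the abstract.
# The global action-value is approximated as a sum of per-agent terms,
# each trained against that agent's LOCAL reward signal.

rng = np.random.default_rng(0)
n_agents, n_actions = 3, 2

# Per-agent critic tables: f[i][a] estimates agent i's local value of action a.
f = np.zeros((n_agents, n_actions))
alpha = 0.5  # critic learning rate

def joint_value(actions):
    """Decomposed critic: Q(a_1..a_N) ~= sum_i f_i(a_i)."""
    return sum(f[i, a] for i, a in enumerate(actions))

# Hypothetical local rewards for demonstration: every agent prefers action 1.
def local_reward(i, a):
    return 1.0 if a == 1 else 0.0

for _ in range(200):
    actions = rng.integers(0, n_actions, size=n_agents)
    for i, a in enumerate(actions):
        # Each per-agent term is regressed on its local reward only,
        # which keeps updates tractable for large populations.
        f[i, a] += alpha * (local_reward(i, a) - f[i, a])

print(round(joint_value([1, 1, 1]), 2))  # near 3.0: each per-agent term ~1
```

The point of the decomposition is visible in the update loop: no agent's critic term ever needs the joint action or joint reward.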
Fully Decentralized Policies for Multi-Agent Systems: An Information Theoretic Approach
Learning cooperative policies for multi-agent systems is often challenged by partial observability and a lack of coordination. In some settings, the structure of a problem allows a distributed solution with limited communication. Here, we consider a scenario where no communication is available, and instead we learn local policies for all agents that collectively mimic the solution to a centralized multi-agent static optimization problem. Our main contribution is an information theoretic framework based on rate distortion theory which facilitates analysis of how well the resulting fully decentralized policies are able to reconstruct the optimal solution. Moreover, this framework provides a natural extension that addresses which nodes an agent should communicate with to improve the performance of its individual policy.
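The core setup, local policies that collectively mimic a centralized solution without communication, can be illustrated with a linear toy problem. The linear model, least-squares fit, and the particular centralized target are assumptions for illustration; the paper's actual contribution is the rate-distortion analysis, not this fitting procedure:

```python
import numpy as np

# Hedged sketch: fit per-agent local policies that mimic a centralized
# solution, then measure the reconstruction error (the "distortion").

rng = np.random.default_rng(1)
n_samples, n_agents = 500, 3

# Global state: one scalar observation per agent.
X = rng.normal(size=(n_samples, n_agents))

# Hypothetical centralized solution: each agent's optimal action depends
# on the mean of ALL observations (information no single agent sees).
u_star = np.tile(X.mean(axis=1, keepdims=True), (1, n_agents))

distortion = 0.0
for i in range(n_agents):
    x_i = X[:, [i]]                      # agent i sees only its own obs
    w, *_ = np.linalg.lstsq(x_i, u_star[:, i], rcond=None)
    u_hat = x_i @ w                      # fully decentralized policy
    distortion += np.mean((u_hat - u_star[:, i]) ** 2)

# Distortion is strictly positive: local information alone cannot fully
# reconstruct the centralized optimum, which is what the rate distortion
# framework quantifies.
print(distortion > 0)  # True
```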
VAIN: Attentional Multi-agent Predictive Modeling
Multi-agent predictive modeling is an essential step for understanding physical, social and team-play systems. Recently, Interaction Networks (INs) were proposed for the task of modeling multi-agent physical systems. One of the drawbacks of INs is scaling with the number of interactions in the system (typically quadratic or higher order in the number of agents). In this paper we introduce VAIN, a novel attentional architecture for multi-agent predictive modeling that scales linearly with the number of agents. We show that VAIN is effective for multi-agent predictive modeling. Our method is evaluated on tasks from challenging multi-agent prediction domains: chess and soccer, and outperforms competing multi-agent approaches.
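The attentional pooling at the heart of the abstract can be sketched as follows. Shapes, the random features, and the distance-based attention kernel are assumptions loosely following the paper's description; the real model learns its encoders end to end:

```python
import numpy as np

# Hedged sketch of VAIN-style attentional pooling: each agent i gets a
# context vector, a softmax-weighted sum of the other agents' encodings,
# with weights derived from distances between attention vectors.

rng = np.random.default_rng(2)
n_agents, d_enc, d_att = 4, 8, 3

E = rng.normal(size=(n_agents, d_enc))  # per-agent encodings e_j
A = rng.normal(size=(n_agents, d_att))  # per-agent attention vectors a_i

def vain_context(E, A):
    # Pairwise squared distances between attention vectors.
    d2 = ((A[:, None, :] - A[None, :, :]) ** 2).sum(-1)
    logits = -d2
    np.fill_diagonal(logits, -np.inf)   # an agent does not attend to itself
    w = np.exp(logits - logits.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)   # rows are softmax weights
    return w @ E                        # (n_agents, d_enc) context vectors

C = vain_context(E, A)
print(C.shape)  # (4, 8)
```

The contrast with Interaction Networks is that interactions are never modeled pairwise with separate networks; everything per-agent is computed once and combined by pooling.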
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments
We explore deep reinforcement learning methods for multi-agent domains. We begin by analyzing the difficulty of traditional algorithms in the multi-agent case: Q-learning is challenged by an inherent non-stationarity of the environment, while policy gradient suffers from a variance that increases as the number of agents grows. We then present an adaptation of actor-critic methods that considers action policies of other agents and is able to successfully learn policies that require complex multi-agent coordination. Additionally, we introduce a training regimen utilizing an ensemble of policies for each agent that leads to more robust multi-agent policies. We show the strength of our approach compared to existing methods in cooperative as well as competitive scenarios, where agent populations are able to discover various physical and informational coordination strategies.
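The key structural idea, a centralized critic that conditions on every agent's observation and action while each actor stays decentralized, can be sketched minimally. The linear critic, tanh actor, and random data are illustrative assumptions, not the paper's networks:

```python
import numpy as np

# Hedged sketch of the centralized-critic idea: during training, each
# agent's critic sees ALL agents' observations and actions, while each
# actor acts from its own observation only.

rng = np.random.default_rng(3)
n_agents, obs_dim, act_dim = 2, 4, 2

def actor(theta_i, obs_i):
    """Decentralized actor: uses only agent i's own observation."""
    return np.tanh(theta_i @ obs_i)

def critic(w_i, all_obs, all_acts):
    """Centralized critic: conditions on every agent's obs and action,
    which removes the non-stationarity each agent would otherwise face."""
    x = np.concatenate([all_obs.ravel(), all_acts.ravel()])
    return w_i @ x

thetas = [rng.normal(size=(act_dim, obs_dim)) for _ in range(n_agents)]
w = rng.normal(size=n_agents * (obs_dim + act_dim))

obs = rng.normal(size=(n_agents, obs_dim))
acts = np.stack([actor(thetas[i], obs[i]) for i in range(n_agents)])
q = critic(w, obs, acts)  # scalar value for this joint (obs, action) pair

print(acts.shape)  # (2, 2)
```

At execution time only the actors are needed, so the extra information the critic consumed during training imposes no runtime coordination cost.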
Local Aggregative Games
Aggregative games provide a rich abstraction to model strategic multi-agent interactions. We focus on learning local aggregative games, where the payoff of each player is a function of its own action and the aggregate behavior of its neighbors in a connected digraph. We show the existence of a pure strategy epsilon-Nash equilibrium in such games when the payoff functions are convex or sub-modular. We prove an information theoretic lower bound, in a value oracle model, on approximating the structure of the digraph with non-negative monotone sub-modular cost functions on the edge set cardinality. We also introduce gamma-aggregative games that generalize local aggregative games, and admit epsilon-Nash equilibria that are stable with respect to small changes in some specified graph property. Moreover, we provide estimation algorithms for the game theoretic model that can meaningfully recover the underlying structure and payoff functions from real voting data.
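The basic object, a game where each player's payoff depends only on its own action and the aggregate of its in-neighbors, is easy to make concrete with an epsilon-Nash check. The three-player digraph, discrete action set, and concave payoff below are illustrative assumptions:

```python
import numpy as np

# Hedged sketch: a unilateral-deviation check for a tiny local
# aggregative game on a digraph. Each player's payoff depends on its own
# action and the mean action of its in-neighbors.

actions = [0.0, 0.5, 1.0]                  # discrete action set
neighbors = {0: [1], 1: [0, 2], 2: [1]}    # in-neighbors on a digraph

def payoff(i, a_i, profile):
    agg = np.mean([profile[j] for j in neighbors[i]])
    # Concave payoff: players prefer matching the neighborhood aggregate.
    return -(a_i - agg) ** 2

def max_regret(profile):
    """Largest gain any single player gets by deviating unilaterally;
    a profile is an epsilon-Nash equilibrium iff this is <= epsilon."""
    regret = 0.0
    for i in range(len(profile)):
        current = payoff(i, profile[i], profile)
        best = max(payoff(i, a, profile) for a in actions)
        regret = max(regret, best - current)
    return regret

# The all-equal profile is an exact equilibrium of this payoff...
print(max_regret([0.5, 0.5, 0.5]))       # 0.0
# ...while a mismatched profile is not.
print(max_regret([0.0, 1.0, 0.0]) > 0)   # True
```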
- Information Technology > Game Theory (0.85)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.61)
- Information Technology > Artificial Intelligence > Machine Learning (0.44)
A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning
There has been a resurgence of interest in multiagent reinforcement learning (MARL), due partly to the recent success of deep neural networks. The simplest form of MARL is independent reinforcement learning (InRL), where each agent treats all of its experience as part of its (non-stationary) environment. In this paper, we first observe that policies learned using InRL can overfit to the other agents' policies during training, failing to sufficiently generalize during execution. We introduce a new metric, joint-policy correlation, to quantify this effect. We describe a meta-algorithm for general MARL, based on approximate best responses to mixtures of policies generated using deep reinforcement learning, and empirical game theoretic analysis to compute meta-strategies for policy selection.
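The overfitting diagnostic can be sketched as a cross-play computation: train several independent runs, evaluate every pairing of run-i and run-j policies, and compare same-run returns against cross-run returns. The return matrix below is synthetic and the gap statistic is one simple reading of the joint-policy correlation idea, not the paper's exact definition:

```python
import numpy as np

# Hedged sketch: a cross-play gap in the spirit of joint-policy
# correlation. R[i, j] is the mean return when player 1 uses run i's
# policy and player 2 uses run j's policy (synthetic numbers).
R = np.array([
    [10.0,  2.0,  1.0],
    [ 3.0, 11.0,  2.0],
    [ 2.0,  1.0,  9.0],
])

def jpc_gap(R):
    n = R.shape[0]
    diag = np.trace(R) / n                       # co-trained pairings
    off = (R.sum() - np.trace(R)) / (n * (n - 1))  # cross-run pairings
    # A large gap means policies only work with the partner they were
    # trained alongside, i.e. they overfit during training.
    return diag - off

print(round(jpc_gap(R), 2))  # 8.17
```

The meta-algorithm in the abstract targets exactly this failure: best-responding to mixtures over many policies, rather than to one co-trained partner, shrinks the gap.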